{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "引用R包"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from rpy2.robjects.packages import importr\n",
    "from rpy2.robjects import FloatVector, StrVector, r\n",
    "import pandas as pd\n",
    "ci = importr(\"CausalImpact\")\n",
    "zoo = importr(\"zoo\")\n",
    "base = importr(\"base\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "加载game_churn的数据，并进行初步转换"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "gc_data = pd.read_csv(\"http://image.cador.cn/data/game_churn.csv\")\n",
    "gc_data.y = [x.split('%')[0] for x in gc_data.y.values]\n",
    "gc_data.orent = [x.split('%')[0] for x in gc_data.orent.values]\n",
    "time_points = base.seq_Date(base.as_Date('2019-04-01'), by=1, length_out=25)\n",
    "data = zoo.zoo(base.cbind(FloatVector(gc_data.y), FloatVector(gc_data.cpi), FloatVector(gc_data.orent)), time_points)\n",
    "pre_peroid = base.as_Date(StrVector(['2019-04-01', '2019-04-15']))\n",
    "post_peroid = base.as_Date(StrVector(['2019-04-16', '2019-04-25']))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "建立时序因果推断模型，并得出分析图表"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "\n",
       "    <span>IntVector with 1 elements.</span>\n",
       "    <table>\n",
       "      <tbody>\n",
       "      <tr>\n",
       "      \n",
       "      <td>\n",
       "        1\n",
       "      </td>\n",
       "      \n",
       "      </tr>\n",
       "      </tbody>\n",
       "    </table>\n",
       "    "
      ],
      "text/plain": [
       "R object with classes: ('integer',) mapped to:\n",
       "<IntVector - Python:0x0000000466426D48 / R:0x000000046779FC10>\n",
       "[1]"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model_args = r(\"list(niter = 10000, nseasons = 7, season.duration = 1)\")\n",
    "impact = ci.CausalImpact(data, pre_peroid, post_peroid, model_args=model_args)\n",
    "r('png(\"data/causalimpact.jpg\")')\n",
    "print(ci.plot_CausalImpact(impact))\n",
    "r('dev.off()')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![](data/causalimpact.jpg)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "查看因果推断的报告"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "rpy2.rinterface.NULL"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "r('sink(\"data/report.txt\")')\n",
    "ci.PrintReport(impact)\n",
    "r('sink()')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Analysis report {CausalImpact}\n",
      "\n",
      "\n",
      "During the post-intervention period, the response variable had an average value of approx. 43.18. By contrast, in the absence of an intervention, we would have expected an average response of 47.37. The 95% interval of this counterfactual prediction is [45.28, 49.47]. Subtracting this prediction from the observed response yields an estimate of the causal effect the intervention had on the response variable. This effect is -4.18 with a 95% interval of [-6.29, -2.09]. For a discussion of the significance of this effect, see below.\n",
      "\n",
      "Summing up the individual data points during the post-intervention period (which can only sometimes be meaningfully interpreted), the response variable had an overall value of 431.84. By contrast, had the intervention not taken place, we would have expected a sum of 473.68. The 95% interval of this prediction is [452.78, 494.73].\n",
      "\n",
      "The above results are given in terms of absolute numbers. In relative terms, the response variable showed a decrease of -9%. The 95% interval of this percentage is [-13%, -4%].\n",
      "\n",
      "This means that the negative effect observed during the intervention period is statistically significant. If the experimenter had expected a positive effect, it is recommended to double-check whether anomalies in the control variables may have caused an overly optimistic expectation of what should have happened in the response variable in the absence of the intervention.\n",
      "\n",
      "The probability of obtaining this effect by chance is very small (Bayesian one-sided tail-area probability p = 0). This means the causal effect can be considered statistically significant. \n",
      "\n"
     ]
    }
   ],
   "source": [
    "with open(\"data/report.txt\", 'r') as f:\n",
    "    print(f.read())"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
